
Hello everyone,

here is the data for four stocks traded on the NYSE: BOEING, DISNEY, IBM and
EXXON.
The data are volume durations.
For each stock and each filtering volume (25 = 25,000 shares, 50 = 50,000
shares, ...), I am sending you two files: a .dat file and a .fmt file. They
are equivalent (I am sending both because Joachim told me that he prefers
.dat files).
Of course, the files contain much more than the volume durations: they also
include market microstructure variables that we do not use in this paper.
For those using the .fmt files, the structure is as follows:

The final file has 16 columns:
1 - index (bid/ask quotes)
2 - duration (bid/ask quotes)
3 - bid
4 - ask
5 - sign of the change: +1 or -1
6 - cumulative volume of trades over the duration (note: 6 = 7 + 8)
7 - cumulative volume of ask transactions over the duration
8 - cumulative volume of bid transactions over the duration
9 - sx (tod standardized duration)
10 - number of ask transactions over the duration
11 - number of bid transactions over the duration
12 - (total number of transactions over the duration), standardized =
standardized trading intensity
13 - (total volume / number of transactions), standardized = standardized
average volume per transaction
14 - (total volume / number of transactions), not standardized =
non-standardized average volume per transaction
15 - average spread over the duration
16 - (total volume / duration), standardized = standardized volume intensity
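
For anyone reading the columns programmatically, here is a minimal Python sketch of the 16-column layout above. The English column names, the helper functions and the sample row are all hypothetical (they are not in the files), and the sketch assumes each row has already been read into a list of 16 numbers. It only labels a row and checks the remark that column 6 = column 7 + column 8.

```python
# Hypothetical English names for the 16 columns, in file order.
# These names are illustrative only; the files themselves carry no headers here.
COLUMNS = [
    "index",                     # 1
    "duration",                  # 2
    "bid",                       # 3
    "ask",                       # 4
    "sign",                      # 5  (+1 or -1)
    "volume_total",              # 6  (= 7 + 8)
    "volume_ask",                # 7
    "volume_bid",                # 8
    "sx",                        # 9
    "n_trades_ask",              # 10
    "n_trades_bid",              # 11
    "trading_intensity_std",     # 12
    "avg_volume_per_trade_std",  # 13
    "avg_volume_per_trade_raw",  # 14
    "avg_spread",                # 15
    "volume_intensity_std",      # 16
]


def label_row(row):
    """Map one row of 16 values to a dict keyed by column name."""
    if len(row) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} values, got {len(row)}")
    return dict(zip(COLUMNS, row))


def volume_identity_holds(rec, tol=1e-9):
    """Check the note in the column list: column 6 = column 7 + column 8."""
    return abs(rec["volume_total"] - (rec["volume_ask"] + rec["volume_bid"])) <= tol


# Made-up row, for illustration only (not taken from the data).
sample = [1, 120.0, 50.0, 50.25, 1, 30000, 18000, 12000,
          0.9, 7, 5, 1.1, 0.8, 2500.0, 0.25, 1.2]
record = label_row(sample)
ok = volume_identity_holds(record)
```

Running the check over every row is a quick way to confirm that a file was read with the columns in the intended order.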

Please remember that the volume durations are defined as the time between
two bid-ask quotes for which a volume of at least V_c has been traded.
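
To make the definition concrete, here is a minimal Python sketch of how volume durations could be built from a quote series. It is not the code used to produce the files. The function name and input layout are hypothetical, and it assumes that the volume counter restarts from zero after each threshold crossing (whether any excess volume carries over is not specified above).

```python
def volume_durations(quote_times, interval_volumes, v_c):
    """Aggregate quote-to-quote intervals until at least v_c has been traded.

    quote_times      -- times of successive bid-ask quotes (e.g. in seconds)
    interval_volumes -- interval_volumes[i] is the volume traded between
                        quote_times[i] and quote_times[i + 1]
    v_c              -- volume threshold V_c

    Assumption: the cumulative volume is reset to zero after each crossing.
    """
    if len(quote_times) != len(interval_volumes) + 1:
        raise ValueError("need exactly one volume per quote-to-quote interval")

    durations = []
    start = quote_times[0]   # quote that opens the current volume duration
    cumulative = 0           # volume traded since `start`
    for t, v in zip(quote_times[1:], interval_volumes):
        cumulative += v
        if cumulative >= v_c:
            durations.append(t - start)
            start = t
            cumulative = 0
    # A trailing spell that never reaches v_c is dropped.
    return durations


# Made-up example with V_c = 25,000 shares.
times = [0, 10, 25, 40, 70, 90]
volumes = [10000, 20000, 5000, 30000, 1000]
result = volume_durations(times, volumes, 25000)  # [25, 45]
```

In this example the threshold is first reached at the quote at time 25 and again at time 70, giving durations of 25 and 45; the last interval never reaches V_c and yields no duration.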

When V_c is large (for example 100,000 shares), the volume durations become
rather long as well, and it is not clear whether we are still in the
framework of "intraday" durations. For example, they average around
45 minutes for most stocks.

I do not know whether we are going to apply the models to the raw
(unfiltered) durations. What do you think?

Pierre